[SOUND] This lecture is about
the Feedback in Text Retrieval.
So, in this lecture,
we're going to continue the discussion
on text retrieval methods.
In particular, we're going to talk
about Feedback in Text Retrieval.
This is a diagram that shows
the retrieval process.
We can see the user would
typed in a query and
then the query would be sent
to a Retrieval Engine or
search engine and
the engine would return results.
These results would be shown to the user.
After the user has seen these results,
the user can actually make judgments.
So for example, the user has say,
well, this is good and
this document is not very useful.
This is good again, et cetera.
Now this is called a relevance judgment or
Relevance Feedback, because we've
got some feedback information from
the user based on the judgments.
This can be very useful to the system.
Learn what exactly is
interesting to the user.
So the feedback module would
then take this as input and
also use the document collection
to try to improve ranking.
Typically, it would involve
updating the query.
So the system can now rank the results
more accurately for the user.
So this is called Relevance Feedback.
The feedback is based on relevance
judgements made by the users.
Now these judgements are reliable, but
the users generally don't want to make
extra effort, unless they have to.
So the downside's that involves
some extra effort by the user.
There is another form of feedback
called a Pseudo Relevance Feedback,
or a blind feedback also
called an automatic feedback.
In this case, you can see once
the user has got without an effect,
we don't have to involve users.
So you can see there's
no user involved here.
And we simply assume that the top
ranked documents to be relevant.
Let's say,
we have assumed the top ten is relevant.
And then we will then use these assumed
documents to learn and
to improve the query.
Now you might wonder, you know,
how could this help if we simply assume
the top rank documents would be random.
Well you can imagine these top rank
documents are actually similar to relevant
documents, even if they are not relevant,
they look like relevant documents.
So, it's possible to learn some related
terms to the query from this set.
In fact, you may recall that we
talked about using language model to
analyze word association to learn
related words to the word computer.
Right?
And then what we did is first,
use computer to retrieve all
the documents that contain computer.
So, imagine now the query
here is a computer.
Right?
And then the results will be those
documents that contain computer.
And what we can do then is
to take the top end results.
They can match computer very well and
we're going to count
the terms in this set and then we're
going to then use the background
language model to choose the terms
that are frequent the in this set,
but not frequent the in
the whole collection.
So, if we make a contrast between
these two, what we can find is that
we'll learn some related terms too, the
work computer as what I've seen before.
And these related words can then be added
to the original query to expand the query.
And this would help us free documents
that don't necessarily match computer,
but match other words like program and
software.
So this is factored for
improving the search doubt.
But of course, pseudo relevancy
feedback is completely unreliable.
We have to arbitrarily set a cutoff.
So there is also something in
between called Implicit Feedback.
In this case, what we do,
we do involve users, but
we don't have to ask
users to make judgements.
Instead, we are going to observe how the
user interacts with the search results.
So, in this case,
we're going to look at the clickthroughs.
So the user clicked on this one and
the, the user viewed this one.
And the user skipped this one and
the user viewed this one again.
Now this also is a clue about whether
a document is useful to the user and
we can even assume that we're going to use
only the snippet here in this document.
The text that's actually seen by the user,
instead of the actual document
of this entry in the link.
There that same web search may be broken,
but that, it doesn't matter.
If the user tries to fetch this document
that because of the displayed text,
we can assume this displayed text is
probably relevant is interesting to user,
so we can learn from such information.
And this is called Implicit Feedback and
we can again,
use the information to update the query.
This is a very important technique
used in modern search engines.
You know, think about Google and Bing and
they can collect a lot of user activities.
Why they are serverless?
Right.
So they would observe what documents we
click on, what documents we skip.
And this information is very valuable and
they can use this to
encode the search engine.
So to summarize,
we would talk about the three kinds of
feedback here rather than feedback.
Where the use exquisite judgement,
it takes some used effort, but
the judgement that
information is reliable.
We talked about the Pseudo Feedback, where
we simply assumed top random documents.
We get random,
we don't have to involve the user.
Therefore, we could do
that actually before we,
we return the results to the user.
And the third is Implicit Feedback,
where we use clickthroughs.
Where we don't, we involve users, but
the user doesn't have to make
explicit effort to make judgement.
[MUSIC]

